Lagrangian reconstruction of cosmic velocity fields 
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We discuss a Lagrangian reconstruction method of the velocity field from galaxy redshift catalog 
that takes its root in the Euler equation. This results in a "functional" of the velocity field which 
must be minimized. This is helped by an algorithm solving the minimization of cost-flow problems. 
The results obtained by applying this method to cosmological problems are shown and boundary 
effects happening in real observational cases are then discussed. Finally, a statistical model of the 
errors made by the reconstruction method is proposed. 

PACS numbers: 47.10.A-,47.15.km,47.11.Fg,95.35.+d,98.62.Py 



INTRODUCTION 



Cosmologists are highly interested in studying galaxy 
peculiar velocities. Indeed, their study is a direct way 
to measure the dynamical state of a system and would 
thus permit to better understand dark matter distribu- 
tion in our local Universe. The main difficulty is that 
measured velocities are only available sparsely and hence 
does not provide a good probe of the matter distribu- 
tion. One must then devise an algorithm that is able to 
predict, under fair hypotheses, galaxy peculiar velocities 
from their present positions, which are their sky coordi- 
nates and their redshift, and compare the result to the 
measurement. Jim Peebles [1] first tried to do full orbit 
reconstruction by evolving the present system back in 
time. This method proved to be quite accurate for very 
small volume and number of objects. However, whenever 
one tries to reconstruct orbits of a large number of galax- 
ies, the method fails because the number of plausible so- 
lution is blowing up. A simplification of this problem is 
presented: 3D galaxy positions are assumed to be known 
and a simpler gravitational dynamic model is going to 
be assumed. We will also assume that the dynamics of 
galaxies is mostly driven by collisionless dark matter par- 
ticles. 

This proceeding is organized as follows. In S HH we 
recall the principal result of the reconstruction method 
developed by [2] (see also the companion paper Mohayaee 
& Sobolevskii, hereafter MS). The method requests the 
using of a special fast algorithm to solve the problem. 
This algorithm is presented in § lllli The method is then 
applied to a dark matter distribution obtained from a 
cosmological simulation and the reconstructed velocities 
are checked against the simulated ones (§ lIVp . Finally, 
a discussion on problems with bad boundary conditions, 
as usually met in observational cosmology, is quickly dis- 
cussed in ? IVl 



II. VELOCITY RECONSTRUCTION THEORY 

The theory of velocity reconstruction in cosmology is 
detailed by MS. We recall here the main results. To re- 
construct the peculiar velocity field one must first com- 
pute the displacement field of dark matter particles by 
solving a Monge- Ampere equation [Eq.(16) of MS]. We 
achieve that by minimizing Eq.(17) of MS in its simplified 
form using the "Auction" algorithm, with a the pairing 
map and fi the mass of each particles of the mesh: 



N 



(1) 



The minimization is conducted over a. We recall also the 
Zel'dovich approximation [Eq.(12) of MS] for the velocity 
field is taking the following form 



v(Xi) = /3(Xi - 



(2) 
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where 13 is the linear growth factor, which is well ap- 
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proximated hy (3 c^i l^m when it is computed at redshift 
z = 0. 



III. MINIMIZATION ALGORITHM 

Direct minimization of Eq. ([T]) is a computationally 
difficult problem [time complexity 0{N\)]. Fortunately, 
there exist better alternatives that have been developped 
for solving minimal cost flow problems which can be 
adapted to our minimal transportation problem. In par- 
ticular, we are going to use the "Auction" algorithm de- 
veloped in [3|. The time complexity of this algorithm is 
of the order of 0(n^'^^) by direct performance measure- 
ment, with n the particle density [7]. The exact constant 
hidden in 0(n^-^^) depends a lot on the difficulty of the 
assignment problem, which means it is catalog depen- 
dent. 
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FIG. 1: Application to Cosmology - Top left: A slice of the 
density field of the ACDM simulation that is used for the 
tests (shades of gray indicates logarithm of the mass density) . 
Top right: Adaptively smoothed line-of-sight component of 
the velocity field in the same slice. Bottom right: MAK re- 
constructed line-of-sight component of the velocity field of 
the same slice. Linear color scale: dark blue=-1000 km s~^, 
white=+1000 km s""*^. Bottom left: Scatter plot between re- 
constructed and simulated velocities for objects identified in 
the simulation. Shades of grey show levels of the logarithm 
of the point density. 



A. Auction algorithm 

The algorithm tries to evolve the pairing map a be- 
tween and qj such that when the function is station- 
ary between two consequent iteration it corresponds to 
minimizing the given total association cost. Particles lo- 
cate at different Eulerian positions Xi compete against 
each other for Lagrangian positions qj. Minimization of 
the total association cost Sa is achieved by studying the 
dual problem of minimization of association penalities pj . 
In H it is shown that 



min Sa 



(3) 



with aij = /i(xi — qj)^, the cost of associating to q^- 
and ri = mmj{aij -\-pj). Once the set {pj} is determined 
by the above maximization, the map a is simply given 
by: 



one, we obtain a set of best assignment A{j) for each 
particle q^ by minimizing all possible r^. Then, we link 
x^* to qj with z* being the particle having the minimal 
Ti* in the set A{j). We also have a reverse mapping for 
this link that we write j*. Finally, the penality pj is 
updated such that 



Pj Pj 



with e > and 



Wi = min {aij . 



(5) 



(6) 



a{i) = arg min {aij -\- Pj} • 



(4) 



The solution found is the same as for e = provided 
e < eo/N, with 

eo = min a^.-. (7) 

{i,j}/a,,,^0 

The time complexity depends quite a lot on the way e is 
scaled down from its initial value to the cq/N. Numerical 
experiments have shown that trying to converge in about 
5 iterations and starting from e/eo — N/2 seems to give 
a faster convergence. 



B. Implementation 

We developped a C++ multithreaded (shared mem- 
ory parallelism) and MPI version of the "Auction" 
algorithm, it will be available later as a multi- 
purpose library for cost-flow problems at the address 
'http : //www . iap . f r/users/lavaux/, Besides doing a 
full minimization over all q^- for a given x^ ("dense" 
mode). It also supports a "sparse" mode that solves a 
partial minimization problem: for a given x^, it only min- 
imizes over a subset of {qj} such that ||x^ — qj||cx) 
where is a parameter given at the initialization to the 
algorithm. This allows to reduce drastically the comput- 
ing time while giving the same result provided that R is 
not too small (typically R = 40 h'^ Mpc for a ACDM 
Universe). On a Dual-core AMD Athlon64 4800+ , the 
SMP implementation (dense mode) takes 50 mins to as- 
signing 79,000 particles. It has successfully reconstructed 
a 128^ dense mesh in a month in sparse mode. The MPI 
version of the corresponding algorithm is only performant 
for larger number of particles (typically N > 500,000). 
Most of the time is, at the moment, spent at comput- 
ing miuj {aij -\-pj) as the cost values are only kept in 
a minimalistic cache. Precomputing the costs is also 
not feasible because of the excessive amount of memory 
that would be needed to store all costs for all (i^j) pairs. 
We also consider to implement a general purpose totally 
asynchronous implementation in the near future. 



IV. APPLICATION TO COSMOLOGY: TEST 
ON COSMOLOGICAL SIMULATION 



Effectively, {pj} is computed iteratively by the algorithm. 
Each iteration is composed of two parts. During the first 



To check that the dark matter dynamical model is 
working, we are testing it against a 128^ A^-body sample 





FIG. 2: Cosmology / Multi- streaming regions - This figure 
illustrates the different problems that may occur for a halo of 
dark matter particles near a cluster of galaxies. Galaxy A is 
in the region of first infall. The displacement field will be well 
reconstructed. Galaxy B is coming from the same direction 
as Galaxy A but has already gone through the center of the 
cluster and is decelerating. In that case, its displacement 
is badly reconstructed as, most likely, MAK predicts that 
the matter composing Galaxy B is coming from the region 
opposite to Galaxy A's region. Galaxy C is also wrongly 
reconstructed. 



0] which was generated with the public version of the A^- 
body code HYDRA [5] to simulate collisionless structure 
formation in a standard ACDM cosmology. The volume 
of the simulation is 200^h~^ Mpc^. The mean matter 
density is ftrn = 0.30 and the cosmological constant is 
Qa = 0.70. The Hubble constant is Hq = 65 km s~^ 
Mpcand the normalization of the density fluctuations in 
a sphere of radius 8 Mpc is as = 0.99. 

Haloes of dark matter particles are identified using a 
friend-of-friend algorithm with a traditional value of the 
linking parameter / = 0.2 times the mean particle separa- 
tion. A limit of 5 linked particles is put to bind particles 
into a halo. The particles left unbound by this criterion 
were kept in a set called the "background field". All 
objects are kept in a mock catalogue called FullMock. 
We have run a reconstruction on FullMock using a MAK 
mesh with 128^ elements. Each object of FullMock was 
given a number of elements equal to the number of 
particles of the original simulation which has been bound 
into this object. We distributed the mesh elements 
regularly on a cubic grid of the same physical size as the 
simulation box. Finally we computed the convex map- 
ping a corresponding to the MAK problem with the help 
of the algorithm described in § 11111 The velocities for each 
particle were computed using the Zel'dovich approxima- 
tion Eq. ([2|), using the same cosmology as the simulation 
to compute D{t). 

Fig. [U summarizes the results obtained using the MAK 
method on the reconstructed velocities. The individual 
object velocities, in the bottom-left panel, are exception- 
ally well reconstructed. Visual inspection of the line- 



FIG. 3: Cosmology / Boundary problems - Left panel: Illus- 
tration of the NaiveDom approach to handle boundary prob- 
lems while doing a reconstruction. The dark starry ball illus- 
trates the current dark matter distribution as inferred from 
galaxy catalogeus. The whitish transparent ball is the as- 
sumed initial volume for the dark matter that has fallen in 
present structures. Right panel: Same as left panel but this 
illustrates the PaddedDom approach. 

of-sight component of the velocity field in the two right 
panels show nearly no discrepancy except in regions with 
really high velocities. In these regions, the dynamics is 
highly non-linear, which means that the convex hypoth- 
esis is not valid anymore. This problem arises on a typi- 
cal cosmological scale of at most a few Mpc around large 
clusters. Indeed, in those regions the fluid description of 
dark matter particles completely fail because the mass 
tracers may have already crossed the center of the grav- 
itational attractor and are currently falling back to the 
center, as illustrated Fig. [2l This renders the displace- 
ment field reconstruction dubious in those cases. 



V. APPLICATION TO COSMOLOGY: 
BOUNDARY PROBLEMS 

One does not necessarily know the Lagrangian domain 
q on which the MAK reconstruction must be computed. 
This is the case for real cosmological observations and 
one must use some empirical prescription to attenuate 
the boundary effects on reconstructed velocities. This 
scheme is helped by the overall homogeneity of the Uni- 
verse above scales larger than 200 Mpc. We propose 
thus to check two schemes to handle boundary effects: 

- A naive approach would be to assume that the 
piece of Universe considered has not changed its 
volume sufficiently between initial time and the cur- 
rent time. This means that we may assume that if 
we select a ball of matter, in the Universe, centered 
on us, all the mass that is inside this ball is coming 
from the same homogeneous ball in the Universe as 
it was at decoupling time. We call this approach 
NaiveDom. It is equivalent to say that tidal field 
effects are totally negligible on the considered scale. 

- An alternative approach is not to make an assump- 
tion on the exact shape but on the low amount of 
fluctuation on the boundary. Consequently, if one 
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FIG. 4: Cosmology / Boundary problems - Outer boundary 
problems while doing reconstruction on finite volume cat- 
alogue. Color scale is the same everywhere (dark blue=- 
1000 km/s, white=+1000 km/s). Top left: Density field of 
the mock catalogue (log scale). Top right: Simulated velocity 
field, smoothed with a 5 Mpc Gaussian window. Low 
left: PaddedDom velocity field, smoothed equally. Low right: 
NaiveDom velocity field, smoothed equally. 



VI. STATISTICAL ANALYSIS OF ERRORS IN 
THE RECONSTRUCTION 

The measurement of the slope between velocities and 
reconstructed displacements should give an estimation of 
rtrn- However, building a reliable estimator of this slope 
without the statistical model of errors made both at the 
observation and the reconstruction level may produce un- 
accepable bias. We propose to show how to use models 
on reconstruction errors to make a bayesian analysis of 
the reconstructed velocities. We will focus here on er- 
rors made during a reconstruction and assume that the 
observed peculiar velocities v are equal to their true ve- 
locities. A more detailed discussion can be found in 

Using simulations, we have measured the distribution 
of reconstruction errors, for each object z of a catalog of 
galaxy, {e^} defined as 



(8) 



with /3 = 0.51 for the studied simulation (corresponding 
to Qrn = 0.30), Vr the line-of-sight component of the sim- 
ulated velocity of the considered, V^r,rec the reconstructed 
radial displacement. The result is given in Fig. [5l We 
have tried to fit an histogram of the errors {e^} by both 
a Gaussian function of width B 



(9) 



fcie) (X exp ( - — 



and a Lorentzian function 



(10) 



selects the same ball of matter in the present Uni- 
verse, it is fair under this approximation to pad 
the matter distribution using homogeneously dis- 
tributed particles. One may then build the map- 
ping between the "padded piece of Universe" and 
an initial completely homogeneous set of particles. 
We call this approach PaddedDom. 



These two ways of handling boundary effects are illus- 
trated Fig. [3] and the results are presented in Fig. [H 

As expected, boundaries are badly reconstructed in 
PaddedDom and NaiveDom. However at the center of 
the spherical cut, the velocity field seems correctly re- 
constructed by visual comparison to the velocity field 
computed from the simulation. Looking carefully at the 
result using NaiveDom indicates that there is likely a sys- 
tematic error near the center (the blue region is darker 
and more extended than in the two other figures). This 
is probably due to stronger boundary effects that are not 
correctly attenuated by the NaiveDom scheme (a detailed 
quantitative analysis of boundary artefacts are given in 
[6]). Empirically, we found that a buffer zone of, at least, 
about 20 Mpc is needed to reduce boundary effects 
with a PaddedDom reconstruction scheme. 



We obtained approximately the same width B for the 
two fits (which is expected from the second order devel- 
opment of both functions), however it is striking that /l 
is a much better approximation than fc to the observed 
error distribution. 

We equate the probability of getting an error e on the 
true velocity for an object of the catalog to /L(e). We 
also assume now that the distribution of velocities in the 
object sample is, for a sufficiently large volume, Gaussian 
with a width a„: 



P{Vr\(Ty) (X exp ( . 



(11) 



Now we can build the joint probability of getting v^^ V^r,] 
and P: 

(X P{e{Vr, ^r,rec)|^, (Ty)P{Vr\B, (J^ )P(V^r,rec | ^, CFy) 



(X P(V^r,rec|^,Cr^)- 



1 



V B ) 



, (12) 



where the constant of proportionality eventually depends 
on B^ ay and p. Using the theorem of Bayes, it is now 
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FIG. 5: Error in the reconstruction - This plot displays the 
probability distribution of the quantity 1^r,rec — 't^r,sim5 where 
and are the line-of-sight reconstructed and sim- 

ulated velocities, respectively, after choosing an observer at 
the center of the simulation box. The dashed and dot-dashed 
curve give the best fit of a Gaussian and a Lorentzian distri- 
bution, respectively. 

possible to compute the conditional probability that the 
true velocity of some object is given that the recon- 
structed displacement is V^r,rec- 

e"^ (^1 + 

To obtain the total likelihood to observe true veloci- 
ties {vi^^} given that the reconstructed displacements are 
{'0i,r,rec}7 o^c may assumc the statistical independence 
of the ('^i,r, '0i,r,rec) ducts. With this assumption, £ is 
simply 

r, rec, (3, B ,crv ) (14) 

i 



Using that approach we have made measurements in fi- 
nite volume mock catalogs. For example, with a Padded- 
Dom reconstruction, one measure = 0.34 with this 
approach (for an effective ftrn = 0.35 in this catalog), 
whereas a naive measurement would yield Q^n < 0.26. 



VII. CONCLUSION 

We presented a method to predict velocities of galax- 
ies from their current position. To solve this problem, we 
implemented a fast algorithm invented by Dimitri Bert- 
sekas [3] and applied the method to a pure dark matter 
simulation. It happens that the reconstructed velocities 
are impressively accurate on large-scales (§ lIVp . How- 
ever, the solution is only approximate in regions where 
multi-streaming occurs. 

We proposed two methods for partially correcting 
boundary effects (§[Yj) and showed how boundary effects 
affect the reconstructed velocity field. We preferred the 
PaddedDom reconstruction scheme as it seems to give 
overall better results. Empirically we found that a buffer 
zone of 20 Mpc is needed before obtaining a recon- 
structed velocity field correlated with the one given by 
the simulation. 

At last, we proposed a bayesian model (§ \VJ^ to ac- 
count for reconstruction errors while estimating the slope 
between the reconstructed displacements and the true ve- 
locities of objects in a galaxy catalogs. 

We would like to continue this work by improving the 
padding schemes to have even less boundary effects and 
make full use of available data in astronomy. We are also 
working on an improved algorithm that is able to take 
into account in a better way the non-linearities that are 
introduced in the velocity field due to gravitational effects 
occuring along particle trajectories. This new algorithm 
will try to fully solve the Euler-Poisson problem. [8] 

This work is partially supported by the ANR grant 
BLAN07-2_183172 (project OTARIE). 
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